1 Introduction


Albert Einstein famously commented, “What really interests me is whether
God had any choice in the creation of the world” [1]. Within the context of
local Lagrangian field theory the answer seems to be that powerful
restrictions exist but some freedom still remains regarding the choice of
dynamical variables and symmetries. By far the greatest restriction is the
obstacle to including higher time derivatives which is implied by
Ostrogradsky’s construction of a canonical formalism for nondegenerate
higher derivative Lagrangians [2].

Mikhail Vasilevich Ostrogradsky lived from 1801 to 1862. He was born to 
a poor family of Ukrainian ethnicity in Pashennaya, which is now in Ukraine 
but was at that time part of the vast Russian Empire. These were momentous
years for Russia, bracketed by its rise to become the predominant military
power during the Napoleonic Wars, and its humiliating collapse before
Britain and France during the Crimean War. Russian society was riven by the
struggle between the forces of reaction and reform. Indeed, Ostrogradsky
was denied his doctorate at the University of Kharkov because the
mathematics professor who had examined him was considered insufficiently 
religious. Later on, Ostrogradsky was placed under police surveillance at the 
start of his career in the Imperial Russian capital of St. Petersburg [3]. 

Ostrogradsky studied and worked in Paris from 1822 through 1827. He 
knew the leading French mathematicians of the time, including Cauchy, who 
paid off his debts and secured him a teaching job. In 1826 Ostrogradsky 
stated and proved the divergence theorem, which was later re-discovered by 
Gauss in the 1830s. Ostrogradsky paid a much shorter visit to Paris in 1830.
However, most of his professional life was spent in St. Petersburg where he 
was elected to the Imperial Academy of Sciences and played an important 
role in the teaching of mathematical sciences. Ostrogradsky wrote in French 
and Russian [3].

Ostrogradsky’s higher derivative generalization of Hamilton’s construction
was published in 1850 [2]. Ostrogradsky’s construction implies that there
is a linear instability in the Hamiltonians associated with Lagrangians
which depend upon more than one time derivative in such a way that the
higher derivatives cannot be eliminated by partial integration. This is
probably why Newton was right to assume the laws of physics take the form
of second order differential equations when expressed in terms of
fundamental dynamical variables.

It might seem curious that Ostrogradsky did not appreciate the importance
of his construction to fundamental theory. However, one must recall that
the researchers of his time were just beginning to make the connection
between energy functionals and the concept of stability, which in those
days meant the absence of growing perturbations. The notion of quantum
fluctuations exploring all perturbations was decades away, and the key
insight that all dynamics is described by interacting continuum field
theories was even further in the future.

Section 2 presents Ostrogradsky’s construction in the context of a point
particle whose position is $x(t)$. Section 3 discusses the consequences of
this result for fundamental theory. Sections 4 and 5 deal with quantization
and degeneracy, respectively. Section 6 contains some concluding remarks.


2 The Construction of Ostrogradsky 

This section presents Ostrogradsky’s construction. First, the usual case of
a first derivative Lagrangian is reviewed to fix concepts and notation. Then
the case of second derivatives is presented. The section closes with a review
of the general case of $N$ time derivatives.


2.1 Hamilton’s Construction 


In the usual case of $L = L(x,\dot{x})$, the Euler-Lagrange equation is,

$$\frac{\partial L}{\partial x} - \frac{d}{dt} \frac{\partial L}{\partial \dot{x}} = 0 . \tag{1}$$

The assumption that $\partial^2 L / \partial \dot{x}^2 \neq 0$ is known as
nondegeneracy. If the Lagrangian is nondegenerate one can write (1) in the
form Newton assumed so long ago for the laws of physics,

$$\ddot{x} = \mathcal{F}(x,\dot{x}) \qquad \Longrightarrow \qquad x(t) = \mathcal{X}(t,x_0,\dot{x}_0) . \tag{2}$$

From this form it is apparent that solutions depend upon two pieces of initial
value data: $x_0 = x(0)$ and $\dot{x}_0 = \dot{x}(0)$.

The fact that solutions require two pieces of initial value data means that
there must be two canonical coordinates, $X$ and $P$. They are traditionally
taken to be,

$$X \equiv x \qquad \text{and} \qquad P \equiv \frac{\partial L}{\partial \dot{x}} . \tag{3}$$

The assumption of nondegeneracy implies one can invert the phase space
transformation (3) to solve for $\dot{x}$ in terms of $X$ and $P$. That is,
there exists a velocity $V(X,P)$ such that,

$$\frac{\partial L}{\partial \dot{x}} \Bigg|_{x = X ,\; \dot{x} = V} = P . \tag{4}$$


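The canonical Hamiltonian is obtained by Legendre transforming on $\dot{x}$,

$$H(X,P) \equiv P \dot{x} - L , \tag{5}$$

$$\phantom{H(X,P)} = P \, V(X,P) - L\Bigl(X, V(X,P)\Bigr) . \tag{6}$$

It is easy to check that the canonical evolution equations reproduce the
inverse phase space transformation (4) and the Euler-Lagrange equation (1),

$$\dot{X} \equiv \frac{\partial H}{\partial P} = V + P \frac{\partial V}{\partial P} - \frac{\partial L}{\partial \dot{x}} \frac{\partial V}{\partial P} = V , \tag{7}$$

$$\dot{P} \equiv -\frac{\partial H}{\partial X} = -P \frac{\partial V}{\partial X} + \frac{\partial L}{\partial x} + \frac{\partial L}{\partial \dot{x}} \frac{\partial V}{\partial X} = \frac{\partial L}{\partial x} . \tag{8}$$

This is the meaning of the statement, the Hamiltonian generates time
evolution. When the Lagrangian has no explicit time dependence, $H$ is also
the associated conserved quantity. Hence it possesses the key properties
physicists want for the energy, and is unique up to canonical transformations.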

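A familiar example is the simple harmonic oscillator of mass $m$ and
frequency $\omega$ whose Lagrangian is,

$$L = \frac{1}{2} m \dot{x}^2 - \frac{1}{2} m \omega^2 x^2 . \tag{9}$$

The equation of motion and its general solution are,

$$\ddot{x} = -\omega^2 x \qquad \Longrightarrow \qquad x(t) = x_0 \cos(\omega t) + \frac{\dot{x}_0}{\omega} \sin(\omega t) . \tag{10}$$

The canonical variables for this system are,

$$X \equiv x \qquad \text{and} \qquad P \equiv m \dot{x} \qquad \Longleftrightarrow \qquad V(X,P) = \frac{P}{m} . \tag{11}$$

And the Hamiltonian is,

$$H = \frac{P^2}{2m} + \frac{1}{2} m \omega^2 X^2 . \tag{12}$$

Because it is quadratic in both $X$ and $P$, the Hamiltonian $H(X,P)$ is
bounded below by zero.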
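As a hedged numerical illustration (not part of the original construction), the sketch below integrates Hamilton's equations (7-8) for the oscillator Hamiltonian (12) and compares the result with the closed form solution (10). The function names, the fourth order Runge-Kutta integrator and the parameter values are assumptions chosen only for illustration.

```python
import math

def hamilton_rhs(X, P, m, omega):
    """Right-hand sides of Hamilton's equations for H = P^2/(2m) + m w^2 X^2/2."""
    dX = P / m                  # dX/dt = +dH/dP = V(X, P)
    dP = -m * omega**2 * X      # dP/dt = -dH/dX
    return dX, dP

def rk4_step(X, P, dt, m, omega):
    """One classical fourth order Runge-Kutta step in phase space."""
    k1x, k1p = hamilton_rhs(X, P, m, omega)
    k2x, k2p = hamilton_rhs(X + 0.5 * dt * k1x, P + 0.5 * dt * k1p, m, omega)
    k3x, k3p = hamilton_rhs(X + 0.5 * dt * k2x, P + 0.5 * dt * k2p, m, omega)
    k4x, k4p = hamilton_rhs(X + dt * k3x, P + dt * k3p, m, omega)
    X_new = X + dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
    P_new = P + dt * (k1p + 2 * k2p + 2 * k3p + k4p) / 6
    return X_new, P_new

def energy(X, P, m, omega):
    """The Hamiltonian (12)."""
    return P**2 / (2 * m) + 0.5 * m * omega**2 * X**2

def evolve(x0, v0, m, omega, t_final, steps):
    """Evolve from initial position x0 and velocity v0 up to t_final."""
    X, P = x0, m * v0           # canonical data from (11)
    dt = t_final / steps
    for _ in range(steps):
        X, P = rk4_step(X, P, dt, m, omega)
    return X, P

def exact_position(x0, v0, omega, t):
    """The general solution (10)."""
    return x0 * math.cos(omega * t) + (v0 / omega) * math.sin(omega * t)

if __name__ == "__main__":
    m, omega, x0, v0 = 1.5, 2.0, 0.3, -0.7
    X, P = evolve(x0, v0, m, omega, t_final=3.0, steps=3000)
    print("numerical x(3):", X)
    print("analytic  x(3):", exact_position(x0, v0, omega, 3.0))
    print("energy drift  :", energy(X, P, m, omega) - energy(x0, m * v0, m, omega))
```

The numerical and analytic positions should agree closely, and the energy should stay constant up to integration error, which is the sense in which $H$ both generates time evolution and is conserved.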
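2.2 Ostrogradsky’s Construction for Two Derivatives

Now consider a system whose Lagrangian $L(x,\dot{x},\ddot{x})$ depends
nondegenerately upon $\ddot{x}$. The Euler-Lagrange equation is,

$$\frac{\partial L}{\partial x} - \frac{d}{dt} \frac{\partial L}{\partial \dot{x}} + \frac{d^2}{dt^2} \frac{\partial L}{\partial \ddot{x}} = 0 . \tag{13}$$

Now nondegeneracy means $\partial^2 L / \partial \ddot{x}^2 \neq 0$, which
implies that the Euler-Lagrange equation (13) can be cast in a form radically
different from Newton’s,

$$\ddddot{x} = \mathcal{F}(x,\dot{x},\ddot{x},\dddot{x}) \qquad \Longrightarrow \qquad x(t) = \mathcal{X}(t,x_0,\dot{x}_0,\ddot{x}_0,\dddot{x}_0) . \tag{14}$$

Because solutions now depend upon four pieces of initial value data there
must be four canonical coordinates. Ostrogradsky’s choices for these are,

$$X_1 \equiv x , \qquad P_1 \equiv \frac{\partial L}{\partial \dot{x}} - \frac{d}{dt} \frac{\partial L}{\partial \ddot{x}} , \tag{15}$$

$$X_2 \equiv \dot{x} , \qquad P_2 \equiv \frac{\partial L}{\partial \ddot{x}} . \tag{16}$$

The assumption of nondegeneracy implies one can invert the phase space
transformation (15-16) to solve for $\ddot{x}$ in terms of $X_1$, $X_2$ and
$P_2$. That is, there exists an acceleration $A(X_1,X_2,P_2)$ such that,

$$\frac{\partial L}{\partial \ddot{x}} \Bigg|_{x = X_1 ,\; \dot{x} = X_2 ,\; \ddot{x} = A} = P_2 . \tag{17}$$

Note that the acceleration $A(X_1,X_2,P_2)$ does not depend upon $P_1$. The
momentum $P_1$ is only needed for the third time derivative.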

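Ostrogradsky’s Hamiltonian is obtained by Legendre transforming on
$\dot{x} = x^{(1)}$ and $\ddot{x} = x^{(2)}$,

$$H(X_1,X_2,P_1,P_2) \equiv \sum_{i=1}^{2} P_i x^{(i)} - L , \tag{18}$$

$$\phantom{H} = P_1 X_2 + P_2 A(X_1,X_2,P_2) - L\Bigl(X_1, X_2, A(X_1,X_2,P_2)\Bigr) . \tag{19}$$

The time evolution equations are those suggested by the notation,

$$\dot{X}_i \equiv \frac{\partial H}{\partial P_i} \qquad \text{and} \qquad \dot{P}_i \equiv -\frac{\partial H}{\partial X_i} . \tag{20}$$

To check that they generate time evolution, note first that the evolution
equation for $X_1$ is,

$$\dot{X}_1 = \frac{\partial H}{\partial P_1} = X_2 . \tag{21}$$

Of course this reproduces the phase space transformation $\dot{x} = X_2$ in
(16). The evolution equation for $X_2$ similarly reproduces (17),

$$\dot{X}_2 = \frac{\partial H}{\partial P_2} = A + P_2 \frac{\partial A}{\partial P_2} - \frac{\partial L}{\partial \ddot{x}} \frac{\partial A}{\partial P_2} = A . \tag{22}$$

The phase space transformation
$P_1 = \frac{\partial L}{\partial \dot{x}} - \frac{d}{dt} \frac{\partial L}{\partial \ddot{x}}$
comes from the evolution equation for $P_2$,

$$\dot{P}_2 = -\frac{\partial H}{\partial X_2} = -P_1 - P_2 \frac{\partial A}{\partial X_2} + \frac{\partial L}{\partial \dot{x}} + \frac{\partial L}{\partial \ddot{x}} \frac{\partial A}{\partial X_2} = -P_1 + \frac{\partial L}{\partial \dot{x}} . \tag{23}$$

And the Euler-Lagrange equation (13) follows from the evolution equation
for $P_1$,

$$\dot{P}_1 = -\frac{\partial H}{\partial X_1} = -P_2 \frac{\partial A}{\partial X_1} + \frac{\partial L}{\partial x} + \frac{\partial L}{\partial \ddot{x}} \frac{\partial A}{\partial X_1} = \frac{\partial L}{\partial x} . \tag{24}$$

Hence Ostrogradsky’s Hamiltonian generates time evolution. When the
Lagrangian contains no explicit dependence upon time it is also the conserved
Noether current. It is therefore the energy, again up to canonical
transformation.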

Ostrogradsky’s Hamiltonian (19) is linear in the canonical momentum $P_1$,
which means that no system of this form can be stable. In fact, there is not
even any barrier to decay. Note the power and generality of the result; it
applies to every Lagrangian $L(x,\dot{x},\ddot{x})$ which depends
nondegenerately upon $\ddot{x}$, independent of the details. The only
assumption is nondegeneracy, and that simply means one cannot eliminate
$\ddot{x}$ by partial integration.

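It is useful to consider a higher derivative example which depends upon
a dimensionless parameter $\epsilon$ that quantifies its deviation from the
simple harmonic oscillator (9),

$$L = -\frac{\epsilon m}{2 \omega^2} \ddot{x}^2 + \frac{m}{2} \dot{x}^2 - \frac{m \omega^2}{2} x^2 . \tag{25}$$

The Euler-Lagrange equation and its general solution are,

$$0 = -m \Bigl[ \frac{\epsilon}{\omega^2} \ddddot{x} + \ddot{x} + \omega^2 x \Bigr] , \tag{26}$$

$$x(t) = C_+ \cos(k_+ t) + S_+ \sin(k_+ t) + C_- \cos(k_- t) + S_- \sin(k_- t) . \tag{27}$$

Here the two frequencies are,

$$k_{\pm} \equiv \omega \sqrt{\frac{1 \mp \sqrt{1 - 4\epsilon}}{2 \epsilon}} , \tag{28}$$

and the constants $C_{\pm}$ and $S_{\pm}$ are functions of the initial value data,

$$C_+ = \frac{k_-^2 x_0 + \ddot{x}_0}{k_-^2 - k_+^2} , \qquad S_+ = \frac{k_-^2 \dot{x}_0 + \dddot{x}_0}{k_+ (k_-^2 - k_+^2)} , \tag{29}$$

$$C_- = \frac{k_+^2 x_0 + \ddot{x}_0}{k_+^2 - k_-^2} , \qquad S_- = \frac{k_+^2 \dot{x}_0 + \dddot{x}_0}{k_- (k_+^2 - k_-^2)} . \tag{30}$$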

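For this model Ostrogradsky’s two conjugate momenta (15-16) are,

$$P_1 = m \dot{x} + \frac{\epsilon m}{\omega^2} \dddot{x} \qquad \Longleftrightarrow \qquad \dddot{x} = \frac{\omega^2 P_1 - m \omega^2 X_2}{\epsilon m} , \tag{31}$$

$$P_2 = -\frac{\epsilon m}{\omega^2} \ddot{x} \qquad \Longleftrightarrow \qquad \ddot{x} = A = -\frac{\omega^2 P_2}{\epsilon m} . \tag{32}$$

The Hamiltonian can be expressed alternatively in terms of canonical
variables, configuration space variables, or the constants $C_{\pm}$ and
$S_{\pm}$,

$$H = P_1 X_2 - \frac{\omega^2}{2 \epsilon m} P_2^2 - \frac{m}{2} X_2^2 + \frac{m \omega^2}{2} X_1^2 , \tag{33}$$

$$\phantom{H} = \frac{\epsilon m}{\omega^2} \dot{x} \dddot{x} - \frac{\epsilon m}{2 \omega^2} \ddot{x}^2 + \frac{m}{2} \dot{x}^2 + \frac{m \omega^2}{2} x^2 , \tag{34}$$

$$\phantom{H} = \frac{m}{2} \sqrt{1 - 4\epsilon} \, k_+^2 (C_+^2 + S_+^2) - \frac{m}{2} \sqrt{1 - 4\epsilon} \, k_-^2 (C_-^2 + S_-^2) . \tag{35}$$

The last form (35) makes it clear that the $+$ modes carry positive energy and
the $-$ modes carry negative energy.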
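The equality of the canonical form (33) and the mode form (35) can be spot-checked numerically. The following is a minimal sketch, assuming $0 < \epsilon < 1/4$ so that both frequencies (28) are real; the function names and sample numbers are illustrative assumptions rather than anything taken from the construction itself.

```python
import math

def frequencies(omega, eps):
    """The two frequencies k_+ and k_- of equation (28), for 0 < eps < 1/4."""
    root = math.sqrt(1.0 - 4.0 * eps)
    k_plus = omega * math.sqrt((1.0 - root) / (2.0 * eps))
    k_minus = omega * math.sqrt((1.0 + root) / (2.0 * eps))
    return k_plus, k_minus

def mode_constants(x0, x1, x2, x3, k_plus, k_minus):
    """Constants (29-30); x0..x3 are x and its first three derivatives at t = 0."""
    d = k_minus**2 - k_plus**2
    C_plus = (k_minus**2 * x0 + x2) / d
    S_plus = (k_minus**2 * x1 + x3) / (k_plus * d)
    C_minus = (k_plus**2 * x0 + x2) / (-d)
    S_minus = (k_plus**2 * x1 + x3) / (k_minus * (-d))
    return C_plus, S_plus, C_minus, S_minus

def energy_canonical(x0, x1, x2, x3, m, omega, eps):
    """Hamiltonian (33), with the canonical variables built from (15-16), (31-32)."""
    X1, X2 = x0, x1
    P1 = m * x1 + (eps * m / omega**2) * x3
    P2 = -(eps * m / omega**2) * x2
    return (P1 * X2 - omega**2 / (2.0 * eps * m) * P2**2
            - 0.5 * m * X2**2 + 0.5 * m * omega**2 * X1**2)

def energy_modes(x0, x1, x2, x3, m, omega, eps):
    """Hamiltonian (35), expressed through the mode constants."""
    k_plus, k_minus = frequencies(omega, eps)
    Cp, Sp, Cm, Sm = mode_constants(x0, x1, x2, x3, k_plus, k_minus)
    root = math.sqrt(1.0 - 4.0 * eps)
    return 0.5 * m * root * (k_plus**2 * (Cp**2 + Sp**2)
                             - k_minus**2 * (Cm**2 + Sm**2))

if __name__ == "__main__":
    m, omega, eps = 2.0, 1.3, 0.1
    data = (0.4, -0.2, 0.7, 1.1)
    print(energy_canonical(*data, m, omega, eps))
    print(energy_modes(*data, m, omega, eps))
    # A pure "minus" mode x(t) = cos(k_- t) carries negative energy.
    _, k_minus = frequencies(omega, eps)
    print(energy_canonical(1.0, 0.0, -k_minus**2, 0.0, m, omega, eps))
```

The first two printed numbers should coincide, and the third should be negative, in line with the sign structure of (35).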


2.3 Ostrogradsky’s Construction for N Derivatives 


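Consider a Lagrangian $L\bigl(x,\dot{x},\ldots,x^{(N)}\bigr)$ which depends
upon the first $N$ derivatives of $x(t)$. If this Lagrangian depends
nondegenerately upon the $N$-th derivative $x^{(N)}$ then the Euler-Lagrange
equation is linear in the $2N$-th derivative $x^{(2N)}$,

$$\sum_{i=0}^{N} \Bigl( -\frac{d}{dt} \Bigr)^{i} \frac{\partial L}{\partial x^{(i)}} = 0 . \tag{36}$$

The canonical phase space must therefore possess $2N$ coordinates which
Ostrogradsky chooses to be,

$$X_i \equiv x^{(i-1)} \qquad \text{and} \qquad P_i \equiv \sum_{j=i}^{N} \Bigl( -\frac{d}{dt} \Bigr)^{j-i} \frac{\partial L}{\partial x^{(j)}} . \tag{37}$$

Nondegeneracy means one can solve for $x^{(N)}$ in terms of $P_N$ and the
$X_i$’s. That is, there exists a function $A(X_1,\ldots,X_N,P_N)$ such that,

$$\frac{\partial L}{\partial x^{(N)}} \Bigg|_{x^{(i-1)} = X_i ,\; x^{(N)} = A} = P_N . \tag{38}$$

For general $N$ Ostrogradsky’s Hamiltonian takes the form,

$$H \equiv \sum_{i=1}^{N} P_i x^{(i)} - L , \tag{39}$$

$$\phantom{H} = P_1 X_2 + P_2 X_3 + \cdots + P_{N-1} X_N + P_N A - L\Bigl(X_1,\ldots,X_N,A\Bigr) . \tag{40}$$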

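The evolution equations are,

$$\dot{X}_i \equiv \frac{\partial H}{\partial P_i} \qquad \text{and} \qquad \dot{P}_i \equiv -\frac{\partial H}{\partial X_i} . \tag{41}$$

It is simple to check that these evolution equations reproduce the canonical
relations (37) and the Euler-Lagrange equation (36). The first $(N-1)$
equations for $X_i$ verify the definition of $X_{i+1}$,

$$i = 1,\ldots,(N-1) \qquad \Longrightarrow \qquad \dot{X}_i = X_{i+1} . \tag{42}$$

The evolution equation for $X_N$ is similar,

$$\dot{X}_N = A + P_N \frac{\partial A}{\partial P_N} - \frac{\partial L}{\partial x^{(N)}} \frac{\partial A}{\partial P_N} = A . \tag{43}$$

The last $(N-1)$ equations for $P_i$ reproduce the definition of $P_{i-1}$,

$$i = 2,\ldots,N \qquad \Longrightarrow \qquad \dot{P}_i = -P_{i-1} - P_N \frac{\partial A}{\partial X_i} + \frac{\partial L}{\partial x^{(i-1)}} + \frac{\partial L}{\partial x^{(N)}} \frac{\partial A}{\partial X_i} , \tag{44}$$

$$\phantom{\dot{P}_i} = -P_{i-1} + \frac{\partial L}{\partial x^{(i-1)}} . \tag{45}$$

And the evolution equation for $P_1$ gives the Euler-Lagrange equation,

$$\dot{P}_1 = -P_N \frac{\partial A}{\partial X_1} + \frac{\partial L}{\partial x} + \frac{\partial L}{\partial x^{(N)}} \frac{\partial A}{\partial X_1} = \frac{\partial L}{\partial x} . \tag{46}$$

Hence (40) generates time evolution. It is also the Noether current for the
case where the Lagrangian contains no explicit time dependence. The
Hamiltonian (40) is therefore what any physicist would call the energy, up to
canonical transformation.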

The Hamiltonian (40) is linear in $P_1, P_2, \ldots, P_{N-1}$. Only with
respect to $P_N$ might it be bounded from below. For large $N$ the fraction of
linear directions, $(N-1)$ out of $2N$, approaches $\frac{1}{2}$ so adding more
higher derivatives makes the instability worse rather than better.


3 Nature of the Instability 

Ostrogradsky’s result implies that the Hamiltonian of a nondegenerate higher
derivative theory is unbounded below, and also above. This section discusses
the manner in which the instability manifests, and what it implies for
fundamental theory. Six short subsections make the points:

1. The Ostrogradskian instability drives the dynamical variable to a special
kind of time dependence, not a special numerical value.

2. The same Ostrogradskian dynamical variable carries both positive and
negative energy creation and annihilation operators.

3. If a system which suffers from the Ostrogradskian instability interacts,
then the empty state can decay into a collection of positive and negative
energy excitations.

4. If a system which suffers from the Ostrogradskian instability is a
continuum field theory, the vast entropy at infinite 3-momentum will make
the decay instantaneous.

5. For interacting systems which suffer from the Ostrogradskian instability,
degrees of freedom with large 3-momentum do not decouple from low energy
physics.

6. The imposition of a single, global constraint on the energy functional
does not ameliorate the Ostrogradskian instability.




3.1 Kinetic Instability 

Physicists are familiar with instabilities of the potential energy. In this
case energy is released as the dynamical variable approaches some special
value. The Ostrogradskian instability is instead a problem with the kinetic
energy, and it manifests by the dynamical variable developing a special time
dependence. Checking that the energy is bounded below for constant values of
the dynamical variable in no way establishes that a system is free of the
Ostrogradskian instability. Consider, for example, the higher derivative
oscillator (25). Expression (34) shows that its energy is bounded below by
zero for any constant value of $x(t)$. Negative energies are attained by
making $\ddot{x}(t)$ large and/or making $\dddot{x}(t)$ large while keeping
the combination $\dot{x}(t) + \frac{\epsilon}{\omega^2} \dddot{x}(t)$ fixed.
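A short numerical sketch can make the contrast concrete by evaluating the configuration space energy (34) on a few sample configurations. The helper names and the sample values below are assumptions chosen for illustration only.

```python
def energy_config(x, xd, xdd, xddd, m, omega, eps):
    """Energy (34) in terms of x and its first three time derivatives."""
    return ((eps * m / omega**2) * xd * xddd
            - (eps * m / (2.0 * omega**2)) * xdd**2
            + 0.5 * m * xd**2
            + 0.5 * m * omega**2 * x**2)

def energy_at_fixed_combination(v, c, m, omega, eps):
    """Energy with x = 0 and xdd = 0, velocity v, and the third derivative
    chosen so that xd + (eps/omega^2) * xddd equals the fixed value c."""
    xddd = (c - v) * omega**2 / eps
    return energy_config(0.0, v, 0.0, xddd, m, omega, eps)

if __name__ == "__main__":
    m, omega, eps = 1.0, 1.0, 0.1
    # Constant x: only the potential term survives, so the energy is nonnegative.
    print(energy_config(5.0, 0.0, 0.0, 0.0, m, omega, eps))
    # Large second derivative: the energy is driven negative.
    print(energy_config(0.0, 0.0, 10.0, 0.0, m, omega, eps))
    # Growing velocity at fixed combination c = 1: energy is m*c*v - m*v^2/2.
    for v in (1.0, 10.0, 100.0):
        print(v, energy_at_fixed_combination(v, 1.0, m, omega, eps))
```

With the combination held fixed the energy reduces to $m c v - \frac{1}{2} m v^2$, which decreases without bound as the velocity $v$ grows, so no amount of testing at constant $x$ would reveal the problem.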


3.2 Double Duty for Dynamical Variables 

Physicists are used to resolving linearized dynamical variables into creation
and annihilation operators. For the harmonic oscillator solution (10) this is
done by using the Euler relation to identify a lowering operator proportional
to $e^{-i\omega t}$ and a raising operator proportional to $e^{+i\omega t}$,

$$x_0 \cos(\omega t) + \frac{\dot{x}_0}{\omega} \sin(\omega t) = \frac{1}{2} \Bigl[ x_0 + \frac{i}{\omega} \dot{x}_0 \Bigr] e^{-i\omega t} + \frac{1}{2} \Bigl[ x_0 - \frac{i}{\omega} \dot{x}_0 \Bigr] e^{i\omega t} . \tag{47}$$

The usual rule is that each dynamical variable harbors either zero or one set
of creation and annihilation operators at linearized order. From expression
(27) one can see that the same higher derivative dynamical variable carries
both positive and negative energy creation and annihilation operators. This
means that local interactions which involve the dynamical variable necessarily
couple the two sectors.
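The decomposition (47) is easy to confirm numerically. The sketch below, whose function names and sample values are illustrative assumptions, builds the two complex coefficients and checks that their sum reproduces the real solution (10).

```python
import cmath
import math

def oscillator_real(x0, v0, omega, t):
    """Left-hand side of (47): the real form of the solution (10)."""
    return x0 * math.cos(omega * t) + (v0 / omega) * math.sin(omega * t)

def oscillator_modes(x0, v0, omega, t):
    """Right-hand side of (47): lowering and raising coefficients times phases."""
    lowering = 0.5 * (x0 + 1j * v0 / omega)   # multiplies exp(-i omega t)
    raising = 0.5 * (x0 - 1j * v0 / omega)    # multiplies exp(+i omega t)
    return (lowering * cmath.exp(-1j * omega * t)
            + raising * cmath.exp(1j * omega * t))

if __name__ == "__main__":
    value = oscillator_modes(0.3, -0.7, 2.0, 1.234)
    print(value.real, value.imag)
    print(oscillator_real(0.3, -0.7, 2.0, 1.234))
```

The imaginary part should vanish up to rounding because the two coefficients are complex conjugates of one another.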


3.3 The Vacuum Can Decay 

Now consider an interacting, continuum field theory which possesses the
Ostrogradskian instability. In particular consider its likely particle
spectrum about some “empty” solution in which the field is constant. Because
the Hamiltonian is linear in all but one of the conjugate momenta it is
possible to arbitrarily increase or decrease the energy by moving in
different directions in phase space. Hence there must be both positive energy
and negative energy particles, just as there are in the higher derivative
oscillator (25). As in that point particle model, the same continuum field
must carry the creation and annihilation operators of both the positive and
the negative energy particles. If the theory is interacting at all (that is,
if its Lagrangian contains a higher than quadratic power of the field) then
there will be interactions between positive and negative energy particles.
Depending upon the interaction, the empty state can decay into some
collection of positive and negative energy particles.

3.4 Entropy Drives Vacuum Decay 

Recall the reason that excited states of atoms decay in nature. It is certainly
not to reduce the energy of the full system (including the interaction with
electromagnetism) but rather to redistribute the constant total energy into
the largest possible class of states. There is one way for the atom not to
decay, compared with an infinite number of ways the atom can decay and emit
one or more recoil photons. Note also that explicit computations of the decay
time employ vacuum fluctuations of the electromagnetic field to provide the
necessary perturbation.

Atomic decays have just the fixed energy difference between the two states
to apportion, so they are chiefly driven by the arbitrary directions which can
be taken by the decay products. In contrast, the decay of an interacting,
nondegenerate higher derivative field theory can involve particles of any
energy, as long as the total sums to zero. So one should think of the decay
rate as having the same sort of angular factors as an atomic decay at some
fixed energy, followed by one or more integrals (all the way to infinity) over
the magnitudes of the various energies. The volume of phase space is so large
that these integrations cause the decay to be instantaneous. Indeed, the only
way people derive finite decay rates for particles with a kinetic instability
is by cutting off the phase space at some point, in which case the rate is
dominated by the cutoff, for example [4]. Such a cutoff might make sense if
the kinetic instability appeared in some nonlocal effective field theory, but
it has no place in fundamental physics.

Note that the decay does not just happen once. It is even more entropically
favored for there to be two decays, and better yet for more. In fact the
system instantly evaporates into a maelstrom of positive and negative energy
particles. Whether or not such a state has a proper mathematical
representation, it certainly does not describe the universe of human
experience in which all particles have positive energy and empty space
remains empty.

Note also that this conclusion only follows if the higher derivative theory
possesses both interactions and continuum particles. The point particle
oscillator (25) has no interactions, so its negative energy degree of freedom
is unobservable. However, it is conceivable that this higher derivative
oscillator could be coupled to a discrete system without engendering any
instability. The feature which drives explosive vacuum decay is the vast
entropy of phase space. Without that it becomes an open question whether or
not there is anything wrong with a higher derivative theory. Of course the
physical universe seems to be described by continuum field theory down to at
least $2.8 \times 10^{-20}$ meters [5], and any observable degree of freedom
must interact, or else it could not be observed, so these seem to be safe
assumptions.

3.5 Large $\|\vec{k}\|$ Modes Do Not Decouple

Physicists are used to ignoring very high energy modes, except for
renormalizations of low energy parameters. This procedure is quite correct
for positive energy modes in a stable theory because exciting a mode requires
energy which must be drawn from de-exciting other modes, and any given
state only has some fixed amount of energy. However, that justification fails
for a theory which suffers from the Ostrogradskian instability because even a
very high (positive or negative) energy mode can be excited by also exciting
modes with the opposite energy. Instead of these large $\|\vec{k}\|$ modes
decoupling, they couple ever more strongly as $\|\vec{k}\|$ grows, because
more and more ways open up to balance its energy by exciting lower modes of
the opposite sign.

3.6 Constraints on H Accomplish Nothing 

It is sometimes imagined that the energy of a higher derivative theory decays 
with time. That is not true. Provided one is dealing with a complete system, 
and provided there is no external time dependence, the energy of a higher 
derivative system is conserved, just as it would be under those conditions 
for a lower derivative theory. This conservation is apparent for the higher
derivative oscillator (25) from expression (35).

The physical problem with nondegenerate higher derivative theories is not
that their energies decay to lower and lower values. The problem is rather
that certain sectors of the theory become arbitrarily highly excited when one
is dealing with an interacting, continuum field theory which has
nondegenerate higher derivatives. For example, Boulware, Horowitz and
Strominger [6] showed that the energy is zero for any asymptotically flat
solution of the higher derivative field equations derived from the
Lagrangian,

$$\mathcal{L} = \alpha C_{\rho\sigma\mu\nu} C^{\rho\sigma\mu\nu} \sqrt{-g} + \beta R^2 \sqrt{-g} , \tag{48}$$

where $C_{\rho\sigma\mu\nu}$ is the Weyl tensor and $R$ is the Ricci scalar.
However, this model is still unstable for $\alpha \neq 0$, as its creators
realized.^1

4 Quantization 

Quantization is very important to understanding the Ostrogradskian
instability because 0-point fluctuations provide the perturbations needed to
ensure that the potential for vacuum decay is actually realized. However, a
quantum higher derivative system has some peculiarities. For example, it is
obvious from relations (15-16) that position and velocity commute! Further,
the wave function of a higher derivative theory depends upon position and
velocity. This section argues first that the classical instability survives
canonical quantization. After presenting a worked-out example, the curious
noncanonical quantizations which sometimes appear in the literature are
discussed.

4.1 A Large Phase Space Instability 

It is often imagined that quantization might protect a higher derivative
system against the Ostrogradskian instability the same way that quantization
prevents the collapse of atoms coupled to electromagnetism. This is a failure
to understand correspondence limits. In the Heisenberg picture the equations
of classical mechanics are identical to those of quantum mechanics. It
also means the very same thing to solve these equations: one expresses the
dynamical variable in terms of time and the allowed initial value data, as in
expressions (10) and (27). The only difference between classical and quantum
mechanics is that the classical initial value data are numbers which can
take any value whereas the quantum initial value data include noncommuting
conjugate operators which obey the Uncertainty Principle. The only classical
phenomena that can be affected by quantization are those whose realization
requires localizing conjugate variables to some volume of the classical phase
space smaller than $\hbar$. So quantum atoms are stable because localizing the
electrons too near the nucleus necessarily induces a large kinetic energy.

^1 It is also worth noting that the requirement of asymptotic flatness in this
model would preclude the response to normal matter, and that imposing the
correct asymptotic condition gives rise to nonzero energy [7].

In contrast, the Ostrogradskian instability derives from the fact that
$P_1 X_2$ can be made arbitrarily negative by taking $P_1$ either very
negative, for positive $X_2$, or else very positive, for negative $X_2$. This
covers essentially half the classical phase space! Further, the variables
$X_2$ and $P_1$ commute with one another in Ostrogradskian quantum mechanics.
So there is no reason to expect that the Ostrogradskian instability is
affected by quantization.

4.2 Quantum Higher Derivative Oscillator 

Consider the second derivative oscillator (25) discussed in section 2.2. There
can be no ground state in the presence of the Ostrogradskian instability
but one might define an “empty” state wave function, $\Omega(X_1,X_2)$, which
has the minimum excitation in both the positive and negative energy degrees of
freedom. The procedure for doing this is simple: first identify the positive
and negative energy lowering operators $a_{\pm}$, and then solve the equations,

$$a_+ |\Omega\rangle = 0 = a_- |\Omega\rangle . \tag{49}$$

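One can recognize the raising and lowering operators by expressing the
general solution (27) in terms of exponentials,

$$x(t) = \frac{1}{2} (C_+ + i S_+) e^{-i k_+ t} + \frac{1}{2} (C_+ - i S_+) e^{i k_+ t} + \frac{1}{2} (C_- + i S_-) e^{-i k_- t} + \frac{1}{2} (C_- - i S_-) e^{i k_- t} . \tag{50}$$

Recall that the $k_+$ mode carries positive energy, so its lowering operator
must be proportional to the $e^{-i k_+ t}$ term,

$$a_+ \sim C_+ + i S_+ , \tag{51}$$

$$\phantom{a_+} \sim \frac{m k_+}{2} \Bigl( 1 + \sqrt{1 - 4\epsilon} \Bigr) X_1 + i P_1 - k_+ P_2 - \frac{i m}{2} \Bigl( 1 - \sqrt{1 - 4\epsilon} \Bigr) X_2 . \tag{52}$$

The $k_-$ mode carries negative energy, so its lowering operator must be
proportional to the $e^{+i k_- t}$ term,

$$a_- \sim C_- - i S_- , \tag{53}$$

$$\phantom{a_-} \sim \frac{m k_-}{2} \Bigl( 1 - \sqrt{1 - 4\epsilon} \Bigr) X_1 - i P_1 - k_- P_2 + \frac{i m}{2} \Bigl( 1 + \sqrt{1 - 4\epsilon} \Bigr) X_2 . \tag{54}$$

Writing $P_j = -i \hbar \frac{\partial}{\partial X_j}$ reveals that the unique
solution to (49) has the form,

$$\Omega(X_1,X_2) = N \exp\Bigl[ -\frac{m \sqrt{1 - 4\epsilon}}{2 \hbar (k_+ + k_-)} \Bigl( k_+ k_- X_1^2 + X_2^2 \Bigr) - \frac{i \sqrt{\epsilon} \, m}{\hbar} X_1 X_2 \Bigr] . \tag{55}$$

The empty wave function (55) is obviously normalizable, so it gives a
state of the quantum system. One can build a complete set of normalized
stationary states by acting arbitrary numbers of $+$ and $-$ raising operators
on it,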


$$|N_+,N_-\rangle \equiv \frac{(a_+^{\dagger})^{N_+}}{\sqrt{N_+!}} \frac{(a_-^{\dagger})^{N_-}}{\sqrt{N_-!}} |\Omega\rangle . \tag{56}$$

On this space of states the Hamiltonian operator is unbounded below, just
as in the classical theory,

$$H |N_+,N_-\rangle = \hbar \Bigl( N_+ k_+ - N_- k_- \Bigr) |N_+,N_-\rangle . \tag{57}$$

This is the correct way to quantize a higher derivative theory. One piece of
evidence for this is that classical configurations of negative energy
correspond to quantum negative energy states.
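To see the unboundedness of the spectrum (57) in numbers, one can tabulate the eigenvalues for small occupation numbers. This is only a sketch, assuming $0 < \epsilon < 1/4$ and units in which $\hbar = 1$ by default; the function names are hypothetical.

```python
import math

def frequencies(omega, eps):
    """The two frequencies k_+ and k_- of equation (28), for 0 < eps < 1/4."""
    if not 0.0 < eps < 0.25:
        raise ValueError("this sketch assumes 0 < eps < 1/4")
    root = math.sqrt(1.0 - 4.0 * eps)
    k_plus = omega * math.sqrt((1.0 - root) / (2.0 * eps))
    k_minus = omega * math.sqrt((1.0 + root) / (2.0 * eps))
    return k_plus, k_minus

def energy_level(n_plus, n_minus, k_plus, k_minus, hbar=1.0):
    """Eigenvalue (57) of the state with occupation numbers N_+ and N_-."""
    return hbar * (n_plus * k_plus - n_minus * k_minus)

def lowest_energy(max_quanta, k_plus, k_minus, hbar=1.0):
    """Smallest eigenvalue among states with at most max_quanta in each mode."""
    return min(energy_level(n_p, n_m, k_plus, k_minus, hbar)
               for n_p in range(max_quanta + 1)
               for n_m in range(max_quanta + 1))

if __name__ == "__main__":
    k_plus, k_minus = frequencies(1.0, 0.1)
    for max_quanta in (1, 5, 25):
        print(max_quanta, lowest_energy(max_quanta, k_plus, k_minus))
```

The lowest eigenvalue within the truncated set is $-\hbar N_- k_-$ at the largest allowed $N_-$, so enlarging the truncation always lowers it further and no ground state exists.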


4.3 Unitarity versus Instability 


Particle physicists who quantize higher derivative theories do not typically
recognize a problem with stability; they instead discuss a breakdown of
unitarity, for example [8]. This is accomplished by regarding the negative
energy lowering operator as a positive energy raising operator. So one defines
a “ground state” $|\overline{\Omega}\rangle$ which obeys the equations,

$$a_+ |\overline{\Omega}\rangle = 0 = a_-^{\dagger} |\overline{\Omega}\rangle . \tag{58}$$

The unique wave function which solves these equations is,

$$\overline{\Omega}(X_1,X_2) = N \exp\Bigl[ -\frac{m \sqrt{1 - 4\epsilon}}{2 \hbar (k_- - k_+)} \Bigl( k_+ k_- X_1^2 - X_2^2 \Bigr) + \frac{i \sqrt{\epsilon} \, m}{\hbar} X_1 X_2 \Bigr] . \tag{59}$$


The wave function (59) is not normalizable, so it does not correspond to a
state of the quantum system [9]. However, particle physicists define a formal
“space of states” based upon $|\overline{\Omega}\rangle$,

$$\overline{|N_+,N_-\rangle} \equiv \frac{(a_+^{\dagger})^{N_+}}{\sqrt{N_+!}} \frac{(a_-)^{N_-}}{\sqrt{N_-!}} |\overline{\Omega}\rangle . \tag{60}$$

Although these wave functions are no more normalizable than
$\overline{\Omega}(X_1,X_2)$, they are all positive energy eigenfunctions,

$$H \overline{|N_+,N_-\rangle} = \hbar \Bigl( N_+ k_+ + N_- k_- \Bigr) \overline{|N_+,N_-\rangle} . \tag{61}$$

The problem with unitarity emerges because $|\overline{\Omega}\rangle$ is
defined to have unit norm, but the commutation relations are unchanged,

$$[a_+, a_+^{\dagger}] = 1 = [a_-, a_-^{\dagger}] . \tag{62}$$

Hence the norm of any state with odd $N_-$ is negative. The first of these
negative norm states is,

$$\overline{\langle 0,1|0,1\rangle} = \langle \overline{\Omega} | a_-^{\dagger} a_- | \overline{\Omega} \rangle = -\langle \overline{\Omega} | \overline{\Omega} \rangle . \tag{63}$$

The next step is to invoke the probabilistic interpretation of quantum
mechanics which requires norms to be positive because probabilities are.
Therefore, the negative norm states must be excised from the space of states.
However, doing that results in a nonunitary S-matrix because scattering
processes inevitably mix positive and negative norm states, just as the
correctly-quantized, indefinite-energy theory allows processes which mix
positive and negative energy particles.

It is important to note that the potential for invoking noncanonical
quantization schemes to change the range of allowed energies is present even
in the usual, first derivative systems. The Schrödinger equation
$H \psi(X) = E \psi(X)$ is a second order differential equation, which
possesses two linearly independent solutions for every value of the energy
$E$. It is only by insisting upon
normalizable wave functions that quantized energies emerge. Many other
peculiar things happen if one abandons normalizability [10, 11]. In
particular, the Correspondence Principle fails, so that taking $\hbar$ to zero
gives a different classical system from the one which originally motivated
the analysis. That is the case for PT-symmetric quantizations of higher
derivative systems [12, 13].

5 Degeneracy 

The only way anyone has ever found to avoid the Ostrogradskian instability is 
by violating the assumption of nondegeneracy upon which it is based. This 
section discusses three ways this can happen: through partial integration, 
through gauge invariance, and by imposing constraints by fiat [14].
5.1 Trivial Degeneracy

The simplest form of degeneracy derives from adding a total derivative to a
first order system. Examples include the Hilbert action of general relativity,
Lovelock gravity [15] and Galileons [16, 17]. In that case one simply performs
a partial integration, and discards the surface term to obtain a Lagrangian
which contains only first time derivatives. For example, the 3rd Lagrangian
for a scalar Galileon $\pi(t,\vec{x})$ reduces to first order form as,

$$\partial_{\mu} \pi \, \partial^{\mu} \pi \, \partial^2 \pi = \frac{\partial}{\partial t} \Bigl[ \frac{1}{3} \dot{\pi}^3 - \dot{\pi} \vec{\nabla} \pi \cdot \vec{\nabla} \pi \Bigr] - \dot{\pi}^2 \nabla^2 \pi + 2 \dot{\pi} \vec{\nabla} \pi \cdot \vec{\nabla} \dot{\pi} + \nabla^2 \pi \, \vec{\nabla} \pi \cdot \vec{\nabla} \pi , \tag{64}$$

$$\longrightarrow \; -\dot{\pi}^2 \nabla^2 \pi + 2 \dot{\pi} \vec{\nabla} \pi \cdot \vec{\nabla} \dot{\pi} + \nabla^2 \pi \, \vec{\nabla} \pi \cdot \vec{\nabla} \pi . \tag{65}$$

Note that it is only necessary to eliminate higher time derivatives; there is
no problem if the Lagrangian contains higher spatial derivatives, or mixed
first time and space derivatives.


5.2 Gauge Degeneracy 

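All theories which possess continuous symmetries are degenerate, irrespective
of whether or not they possess higher derivatives. A familiar example is
the relativistic point particle, whose dynamical variable is $X^{\mu}(\tau)$
and whose Lagrangian is,

$$L = -m \sqrt{-\eta_{\mu\nu} \dot{X}^{\mu} \dot{X}^{\nu}} . \tag{66}$$

The conjugate momentum is,

$$P_{\mu} \equiv \frac{m \, \eta_{\mu\nu} \dot{X}^{\nu}}{\sqrt{-\eta_{\rho\sigma} \dot{X}^{\rho} \dot{X}^{\sigma}}} . \tag{67}$$

One cannot solve (67) for $\dot{X}^{\mu}$ in terms of $X^{\mu}$ and $P_{\mu}$
because the equation is homogeneous of degree zero. The continuous symmetry
associated with this degeneracy is invariance under changes of the parameter
$\tau \rightarrow \tau'(\tau)$,

$$X^{\mu}(\tau) \longrightarrow X'^{\mu}(\tau) \equiv X^{\mu}\Bigl( \tau'^{-1}(\tau) \Bigr) . \tag{68}$$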

The cure for symmetry-induced degeneracy is simply to fix the symmetry
by imposing gauge conditions. Then the gauge-fixed Lagrangian should no
longer be degenerate in terms of the remaining variables. For example, one
might fix the parameter $\tau$ to obey $\tau = X^0(\tau)$. In that case the
gauge-fixed particle Lagrangian is,

$$L_{GF} = -m \sqrt{1 - \dot{\vec{X}} \cdot \dot{\vec{X}}} , \tag{69}$$

and the relations for the momenta are simple to invert,

$$P_i = \frac{m \dot{X}^i}{\sqrt{1 - \dot{\vec{X}} \cdot \dot{\vec{X}}}} \qquad \Longleftrightarrow \qquad \dot{X}^i = \frac{P_i}{\sqrt{m^2 + \vec{P} \cdot \vec{P}}} . \tag{70}$$
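Both statements can be illustrated numerically: the covariant relation (67) is insensitive to rescaling the velocities, while the gauge-fixed relation (70) inverts cleanly. The sketch below assumes the metric $\eta = \mathrm{diag}(-1,1,1,1)$ and units with $c = 1$; the function names and sample values are illustrative assumptions.

```python
import math

ETA = (-1.0, 1.0, 1.0, 1.0)  # assumed signature of the Minkowski metric

def covariant_momentum(m, xdot):
    """Momentum (67) for a timelike 4-velocity xdot = (X0dot, X1dot, X2dot, X3dot)."""
    norm = math.sqrt(xdot[0]**2 - sum(v * v for v in xdot[1:]))
    return tuple(m * eta * v / norm for eta, v in zip(ETA, xdot))

def gauge_fixed_momentum(m, velocity):
    """First relation in (70): spatial momentum from the 3-velocity."""
    factor = math.sqrt(1.0 - sum(v * v for v in velocity))
    return tuple(m * v / factor for v in velocity)

def velocity_from_momentum(m, momentum):
    """Second relation in (70): 3-velocity recovered from the spatial momentum."""
    energy = math.sqrt(m * m + sum(p * p for p in momentum))
    return tuple(p / energy for p in momentum)

if __name__ == "__main__":
    m = 2.0
    xdot = (1.0, 0.3, -0.2, 0.1)
    rescaled = tuple(3.0 * v for v in xdot)
    # Homogeneity of degree zero: rescaling the velocities leaves P_mu unchanged,
    # so (67) cannot be solved for the velocities.
    print(covariant_momentum(m, xdot))
    print(covariant_momentum(m, rescaled))
    # After gauge fixing, the map between velocity and momentum is invertible.
    v = (0.3, -0.2, 0.1)
    print(velocity_from_momentum(m, gauge_fixed_momentum(m, v)))
```

The first two printed tuples should agree, and the last should reproduce the input 3-velocity.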


When a continuous symmetry is used to eliminate a dynamical variable,
the equation of motion of this variable typically becomes a constraint. For
symmetries enforced by means of a compensating field (such as making the
Hilbert action local Lorentz invariant using the antisymmetric components
of the vierbein [18], or Weyl invariant using a scalar [19]) the associated
constraints are tautologies of the form $0 = 0$. Sometimes the constraints are
nontrivial, but implied by the equations of motion. An example of this kind is
the relativistic particle considered above. In synchronous gauge
($\tau = X^0(\tau)$) the equation of the gauge-fixed zero-component implies
that the Hamiltonian is conserved,

$$\frac{d}{d\tau} \Bigl[ \frac{m \dot{X}^0}{\sqrt{-\eta_{\mu\nu} \dot{X}^{\mu} \dot{X}^{\nu}}} \Bigr] = 0 \qquad \Longrightarrow \qquad \frac{d}{dt} \Bigl[ \sqrt{m^2 + \vec{P} \cdot \vec{P}} \Bigr] = 0 . \tag{71}$$

And sometimes the constraints give nontrivial relations between the canonical
variables that generate residual, time-independent symmetries. In this case
another degree of freedom can be removed. An example of this kind of
constraint is Gauss’ Law in temporal gauge electrodynamics.

When constraints of the third type are present one must check whether 
or not they affect the instability. This obviously depends on the particular 
model being studied but a necessary condition for avoiding the Ostrograd- 
skian instability is that the number of gauge constraints must equal or exceed 
the number of unstable directions in the canonical phase space. Because the 
number of constraints for any given symmetry is fixed, whereas the number of 
unstable directions increases with the number of higher derivatives, it follows 
that gauge constraints can at best avoid instability for some fixed number of 
higher derivatives. 
A good example of gauge degeneracy is provided by the quadratic curvature
model which was exhibited at the end of section 3 to show the
irrelevance of a global constraint on the Hamiltonian. As long as $\alpha$ and $\beta$ are
both nonzero, there are six independent, higher derivative momenta at each
space point, whereas there are only four local constraints. If $\beta = 0$ the model
acquires a new local symmetry, Weyl invariance, which adds another
local constraint. Hence there are either one or two unconstrained instabilities
per space point for $\alpha \neq 0$. There are an infinite number of space points, so
the addition of a single, global constraint does not change anything.

The case of $\alpha = 0$ is special. If $\beta$ has the right sign the resulting model
has long been known to have positive energy [20, 21]. This result in no way
contradicts the previous analysis. When $\alpha = 0$, the terms which carry second
derivatives are contracted in such a way that only a single component of the
metric carries higher derivatives. So the counting is one unstable direction
per space point versus four local constraints, which means the constraints
can prevent the Ostrogradskian instability.

5.3 Imposed Degeneracy 

Many attempts to evade the Ostrogradskian instability are based on segregating
higher derivatives to interaction terms so that the free theory possesses no
extra solutions. This renders the instability invisible to perturbative scrutiny
but does not avoid it. One can see from the construction of section 2 that the
sole assumption needed to derive the instability is nondegeneracy, irrespective
of how one organizes any approximation technique. On the other hand,
there is a way of imposing constraints so as to make the theory agree with
its perturbative development. When this is done there are no more higher
derivative degrees of freedom, but this constrained version of the theory
cannot serve to define an acceptable model unless the perturbative solution
converges.

The technique is to regard higher derivative parts of the Euler-Lagrange 
equation as a perturbation and then use the unperturbed equation to reduce 
the order [22]. Of course this produces a remainder with even more higher 
derivatives, but this remainder is also higher order in perturbation theory. 
By iterating the procedure infinitely, and then neglecting the remainder, one
obtains a lower order equation. 

The technique can be illustrated for the higher derivative oscillator by
regarding the parameter $\epsilon$ as a coupling constant so that its Euler-Lagrange
equation takes the form,

$$\ddot{x} + \omega^2 x = -\frac{\epsilon}{\omega^2} \frac{d^2}{dt^2} \ddot{x} \equiv -\epsilon D^2 \ddot{x} . \tag{72}$$

The first iteration gives,

$$\ddot{x} + \omega^2 x = +\epsilon \ddot{x} + \epsilon^2 D^4 \ddot{x} = -\epsilon \omega^2 x - \epsilon^2 (1 - D^2) D^2 \ddot{x} . \tag{73}$$


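After another iteration one obtains,

$$\ddot{x} + \omega^2 x = -\epsilon \Bigl[1 + \epsilon (1 + \epsilon)^2 (2 + \epsilon)\Bigr] \omega^2 x - \epsilon^4 \Bigl[(2 + \epsilon)(1 + \epsilon) - (2 + \epsilon) D^2 + D^4\Bigr] (1 - D^2) D^2 \ddot{x} . \tag{74}$$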

Continuing in this fashion, and ignoring the remainder, gives, 

$$\ddot{x} + k_+^2 x = 0 . \tag{75}$$


From the full theory, the perturbative development has retained only the 
solution whose frequency is well behaved for $\epsilon \rightarrow 0$,

$$k_+^2 = \omega^2 \Bigl[1 + \epsilon + 2 \epsilon^2 + O(\epsilon^3)\Bigr] . \tag{76}$$


It has discarded the solution whose frequency blows up as $\epsilon \rightarrow 0$,

$$k_-^2 = \omega^2 \Bigl[\frac{1}{\epsilon} - 1 - \epsilon - 2 \epsilon^2 + O(\epsilon^3)\Bigr] . \tag{77}$$
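The way the iteration selects one root and discards the other can be sketched numerically. The sketch below assumes the Euler-Lagrange equation has the form $\ddot{x} + \omega^2 x = -(\epsilon/\omega^2)\, d^4x/dt^4$, so that on a single mode $x \propto \cos(kt)$ it reduces to the algebraic relation $k^2 = \omega^2 + \epsilon k^4/\omega^2$. Repeatedly substituting the left side into the right, starting from the $\epsilon = 0$ value, mimics the perturbative reduction. The function names are illustrative, not taken from the text.

```python
import math


def k_squared_exact(eps, omega=1.0):
    """Both roots of eps*k^4/omega^2 - k^2 + omega^2 = 0, as (k_+^2, k_-^2)."""
    root = math.sqrt(1.0 - 4.0 * eps)
    k_plus_sq = omega**2 * (1.0 - root) / (2.0 * eps)
    k_minus_sq = omega**2 * (1.0 + root) / (2.0 * eps)
    return k_plus_sq, k_minus_sq


def k_squared_iterated(eps, omega=1.0, n_iter=200):
    """Fixed-point iteration k^2 -> omega^2 + eps*k^4/omega^2 from k^2 = omega^2."""
    k_sq = omega**2  # unperturbed frequency (eps = 0)
    for _ in range(n_iter):
        k_sq = omega**2 + eps * k_sq**2 / omega**2
    return k_sq


if __name__ == "__main__":
    eps = 0.1
    k_plus_sq, k_minus_sq = k_squared_exact(eps)
    print("iterated :", k_squared_iterated(eps))
    print("k_+^2    :", k_plus_sq, " series:", 1 + eps + 2 * eps**2)
    print("k_-^2    :", k_minus_sq, " series:", 1 / eps - 1 - eps - 2 * eps**2)
```

For $0 < \epsilon < 1/4$ the map is a contraction near $k_+^2$, so the iteration settles on the root that is regular as $\epsilon \rightarrow 0$ and never reaches $k_-^2$.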


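The perturbative development (75) is what results if one changes the
original theory by imposing the constraints,

$$\ddot{x}(t) = -k_+^2 x(t) \quad \Longleftrightarrow \quad P_2 = \frac{m}{2} \Bigl(1 - \sqrt{1 - 4\epsilon}\Bigr) X_1 , \tag{78}$$

$$\dddot{x}(t) = -k_+^2 \dot{x}(t) \quad \Longleftrightarrow \quad X_2 = \frac{1}{2 \epsilon m} \Bigl(1 - \sqrt{1 - 4\epsilon}\Bigr) P_1 . \tag{79}$$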

Under these constraints the Hamiltonian becomes, 

$$H_{\rm pert} = \frac{m}{2} \sqrt{1 - 4\epsilon} \, \Bigl(X_2^2 + k_+^2 X_1^2\Bigr) , \tag{80}$$


which is that of a positive energy harmonic oscillator with mass $\sqrt{1 - 4\epsilon} \, m$
and frequency $k_+$. If the constraints (78)-(79) are imposed at one instant, they
remain valid for all times as a consequence of the full equation of motion (26),
so the constrained model is consistent. This is ultimately a consequence of
the fact that, for this model, the perturbative expansion converges. That is
what ensures that the discarded remainder term really goes to zero when the
expansion is carried to infinite order.
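The reduction of the Hamiltonian under the constraints can be checked numerically. The sketch below rests on two assumptions that are not spelled out in this section: that the oscillator Lagrangian is $L = -\frac{\epsilon m}{2\omega^2} \ddot{x}^2 + \frac{m}{2} \dot{x}^2 - \frac{m \omega^2}{2} x^2$, and that its Ostrogradskian Hamiltonian is therefore $H = P_1 X_2 - \frac{\omega^2}{2 \epsilon m} P_2^2 - \frac{m}{2} X_2^2 + \frac{m \omega^2}{2} X_1^2$. The function name is illustrative.

```python
import math


def constrained_hamiltonian(eps, m, omega, x1, p1):
    """Return (H on the constraint surface, oscillator form with mass s*m)."""
    s = math.sqrt(1.0 - 4.0 * eps)
    k_plus_sq = omega**2 * (1.0 - s) / (2.0 * eps)

    # Constraints (78)-(79): P_2 in terms of X_1, and X_2 in terms of P_1.
    p2 = 0.5 * m * (1.0 - s) * x1
    x2 = (1.0 - s) * p1 / (2.0 * eps * m)

    # Assumed Ostrogradskian Hamiltonian of the higher derivative oscillator.
    h_full = (p1 * x2
              - omega**2 * p2**2 / (2.0 * eps * m)
              - 0.5 * m * x2**2
              + 0.5 * m * omega**2 * x1**2)

    # Harmonic oscillator with mass sqrt(1 - 4 eps) * m and frequency k_+.
    h_pert = 0.5 * m * s * (x2**2 + k_plus_sq * x1**2)
    return h_full, h_pert


if __name__ == "__main__":
    h_full, h_pert = constrained_hamiltonian(0.1, 2.0, 3.0, 0.7, -1.3)
    print(h_full, h_pert)
```

The term linear in $P_1$ and the negative $P_2^2$ term, which are what make the unconstrained Hamiltonian unbounded below, combine on the constraint surface into a manifestly positive expression.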

For nonlinear Euler-Lagrange equations it is more difficult to reach a second
order form, but one can still do it. As before, the ultimate consistency
of the reduced system depends upon the convergence of the perturbative
expansion. For certain mechanical systems it does converge. For example, a
higher derivative generalization of a particle moving in a uniform
gravitational acceleration $g$ is,




$$L = \frac{1}{2} m \dot{x}^2 + m g x + \frac{\epsilon m}{6 g} x \ddot{x}^2 . \tag{81}$$


Reducing to second order transforms the higher derivative corrections into a 
distortion of the acceleration. 


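$$\ddot{x} = \frac{g}{\epsilon} \Bigl[1 - \sqrt{1 - 2\epsilon}\Bigr] = g \Bigl[1 + \frac{1}{2} \epsilon + \frac{1}{2} \epsilon^2 + O(\epsilon^3)\Bigr] . \tag{82}$$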
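The structure of (82) can be made explicit with a short worked iteration. This is a sketch which assumes that (82) reads $\ddot{x} = (g/\epsilon)[1 - \sqrt{1 - 2\epsilon}]$; the symbol $a$ for the constant acceleration and the labels $a_n$ for successive approximations are introduced here.

```latex
\begin{align}
\epsilon a^2 - 2 g a + 2 g^2 = 0
\quad &\Longleftrightarrow \quad
a = g + \frac{\epsilon}{2 g} \, a^2 , \\
a_0 = g , \qquad
a_1 &= g \Bigl[1 + \frac{1}{2} \epsilon\Bigr] , \qquad
a_2 = g \Bigl[1 + \frac{1}{2} \epsilon + \frac{1}{2} \epsilon^2 + \frac{1}{8} \epsilon^3\Bigr] , \\
a_\infty &= \frac{g}{\epsilon} \Bigl[1 - \sqrt{1 - 2\epsilon}\Bigr] .
\end{align}
```

Each substitution fixes one more order in $\epsilon$, and for $0 < \epsilon < 1/2$ the sequence converges to the root that is regular as $\epsilon \rightarrow 0$. The other root, $(g/\epsilon)[1 + \sqrt{1 - 2\epsilon}]$, diverges in that limit and is never reached.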


However, there are no known, interacting, (3+1)-dimensional field theories
for which the perturbative expansion converges. Nor has anyone ever found
a consistent way of imposing constraints which avoids the Ostrogradskian
instability for an interacting, (3+1)-dimensional, higher derivative field
theory.

6 Conclusions 

Although it was not apparent in 1850, Ostrogradsky’s theorem can today
be recognized as the strongest restriction on what sorts of interacting local
quantum field theories can describe fundamental physics. No symmetry
principle has a broader scope or comparable power. Its applications include:

• Demonstrating that higher derivative counterterms cannot be a fundamental
solution to the problem of quantum gravity [23];

• Establishing $f(R)$ models as the only metric-based, local and potentially
stable modifications of gravity [23]; and

• Discussing the problems of nonlocal models which can be viewed as the
limits of an infinite sequence of higher derivatives [25].

One should also note the recent generalization by Motohashi and Suyama of 
Ostrogradsky’s result to Lagrangian-based systems (such as fermions) whose 
Euler-Lagrange equations involve an odd number of time derivatives [26],